An Integrated Mass-Spectrometry Pipeline Identifies Novel Protein Coding-Regions in the Human Genome

Authors

  • Danny A. Bitton
  • Duncan L. Smith
  • Yvonne Connolly
  • Paul J. Scutt
  • Crispin J. Miller
Abstract

BACKGROUND Most protein mass spectrometry (MS) experiments rely on searches against a database of known or predicted proteins, which limits their usefulness as a gene discovery tool.

RESULTS Using a search against an in silico translation of the entire human genome, combined with a series of annotation filters, we identified 346 putative novel peptides [False Discovery Rate (FDR) < 5%] in an MS dataset derived from two human breast epithelial cell lines. A subset of these was then successfully validated by a different MS technique. Two of these correspond to novel isoforms of Heterogeneous Ribonuclear Proteins, while the rest correspond to novel loci.

CONCLUSIONS MS technology can be used for ab initio gene discovery in human data. Because it rests on different underlying assumptions, it identifies protein-coding genes not found by other techniques. As MS technology continues to evolve, such approaches will become increasingly powerful.
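The abstract does not say how the in silico translation of the genome was produced. One common way to build such a search space is six-frame translation, in which a DNA sequence is translated in three reading frames on each strand. The sketch below is a minimal illustration under that assumption, not the authors' pipeline; the function names (`reverse_complement`, `translate`, `six_frame_translation`) are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate a DNA sequence in all six reading frames.

    Keys are '+1', '+2', '+3' for the forward strand and
    '-1', '-2', '-3' for the reverse-complement strand.
    """
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

In a real proteogenomic search, the translated frames would typically be split at stop codons (`*`) and digested in silico before being matched against spectra; those steps are omitted here.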

Similar resources

Long non-coding RNAs and their significance in human diseases

Protein-coding genes account for only a small fraction of the human genome and most of the genomic sequences are transcriptionally silent, but recent observations indicate significant functional elements, including non-coding protein transcripts in the human genome. Long non-coding RNAs (lncRNAs) have been defined as transcripts of >200 nucleotides without protein-coding capacity that perform t...

The influence of transcript assembly on the proteogenomics discovery of microproteins

Proteogenomics methods have identified many non-annotated protein-coding genes in the human genome. Many of the newly discovered protein-coding genes encode peptides and small proteins, referred to collectively as microproteins. Microproteins are produced through ribosome translation of small open reading frames (smORFs). The discovery of many smORFs reveals a blind spot in traditional gene-fin...

Improving gene annotation using peptide mass spectrometry.

Annotation of protein-coding genes is a key goal of genome sequencing projects. In spite of tremendous recent advances in computational gene finding, comprehensive annotation remains a challenge. Peptide mass spectrometry is a powerful tool for researching the dynamic proteome and suggests an attractive approach to discover and validate protein-coding genes. We present algorithms to construct a...

P-130: Designing and Construction of An Appropriate Eukaryotic Expression Vector to Generate Soluble Form of Human Hyaluronidase Type PH20 in Cell Culture Feasible for Application in IVF and ICSI

Background: Hyaluronidases are enzymes that hydrolyze the β-1,4 glycosidic linkage of hyaluronan. Hyaluronan is a polymer consisting of a repeating disaccharide unit found in the cumulus oophorus complex, seminal fluid and other tissues. In addition to hydrolyzing hyaluronan, hyaluronidase can penetrate the cumulus cell layer that surrounds the oocyte, which is why it is termed a spreading factor. Moreov...

A Massively Parallel Pipeline to Clone DNA Variants and Examine Molecular Phenotypes of Human Disease Mutations

Understanding the functional relevance of DNA variants is essential for all exome and genome sequencing projects. However, current mutagenesis cloning protocols require Sanger sequencing, and thus are prohibitively costly and labor-intensive. We describe a massively-parallel site-directed mutagenesis approach, "Clone-seq", leveraging next-generation sequencing to rapidly and cost-effectively ge...

Journal title:

Volume 5, Issue

Pages -

Publication date: 2010